Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
2814 | 290 | 1,5 |
3052 | 263 | 2,5 |
5351 | 126 | 4,5 |
5416 | 124 | 7,6 |
5452 | 123 | 3,5 |
6366 | 99 | 1,2 |
6367 | 99 | 1,3 |
6368 | 99 | 1,6 |
7125 | 84 | 5,5 |
7499 | 78 | 1,4 |

Words containing (:

Rank in Wordlist | Frequency | Word |
---|---|---|
35680 | 6 | A(H1N1 |
37045 | 6 | Pertamina)Singapura,(ANTARA |
42149 | 5 | cin(T)a |
47967 | 4 | ayat(4 |
50701 | 3 | 7-6(2 |
50702 | 3 | 7-6(4 |
50703 | 3 | 7-6(5 |
51857 | 3 | Cin(T)a |
51858 | 3 | Cin(t)a |
56859 | 3 | Wahyu)Jakarta,(ANTARA |

Words containing ):

Rank in Wordlist | Frequency | Word |
---|---|---|
21772 | 15 | SBY)-Boediono |
27347 | 10 | JK)-Wiranto |
27560 | 10 | News)- |
34076 | 7 | Rekotomo)Jakarta |
37044 | 6 | Pertamina)New |
37045 | 6 | Pertamina)Singapura,(ANTARA |
42149 | 5 | cin(T)a |
46626 | 4 | Pertamina)@New |
47712 | 4 | Wahyu)Jakarta |
51857 | 3 | Cin(T)a |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
82634 | 1 | 10%-nya |
88775 | 1 | 50%+1 |
90843 | 1 | 9%-10 |
91051 | 1 | 90%diantaranya |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
22721 | 14 | S&P |
27689 | 10 | R&B |
27887 | 10 | WP&B |
30731 | 8 | D&B |
39221 | 5 | AT&T |
44818 | 4 | E&P |
55688 | 3 | S&P/Case-Shiller |
68077 | 2 | L&G |
68482 | 2 | MBH&Co |
71120 | 2 | R&D |

Words containing $:

Rank in Wordlist | Frequency | Word |
---|---|---|
56696 | 3 | US$45 |
62538 | 2 | AS$300 |
62539 | 2 | AS$85 |
73892 | 2 | US$400 |
73893 | 2 | US$6 |
91670 | 1 | AS$1 |
91671 | 1 | AS$10 |
91672 | 1 | AS$25 |
127381 | 1 | U$13 |
127452 | 1 | US$1,75 |

Words containing ":

Rank in Wordlist | Frequency | Word |
---|---|---|
57265 | 3 | baik,"kata |
57982 | 3 | facebook"nya |
58159 | 3 | ini,"kata |
58160 | 3 | ini,"katanya |
58161 | 3 | ini,"ujarnya |
58202 | 3 | itu,"katanya |
58607 | 3 | lainnya,"katanya |
59427 | 3 | pembeli,"kata |
60270 | 3 | tersebut,"kata |
60271 | 3 | tersebut,"katanya |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
12679 | 35 | Eto'o |
20780 | 16 | L'Aquila |
20929 | 16 | Samuel Eto'o |
21580 | 15 | Jum'at |
21688 | 15 | O'Neill |
24817 | 12 | O'Shea |
26030 | 11 | Martin O'Neill |
29037 | 9 | John O'Shea |
30594 | 8 | Asy'ari |
33364 | 7 | Hasyim Asy'ari |

Words containing +:

Rank in Wordlist | Frequency | Word |
---|---|---|
11459 | 41 | ASEAN+3 |
22477 | 14 | H+7 |
27585 | 10 | P5+1 |
33339 | 7 | H+2 |
33340 | 7 | H+3 |
36244 | 6 | H+5 |
52683 | 3 | H+10 |
61472 | 2 | 29+200 |
61854 | 2 | 5+1 |
61902 | 2 | 50+20 |

Words containing *:

Rank in Wordlist | Frequency | Word |
---|---|---|
91606 | 1 | ANTARA/*)Jakarta |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
1683 | 543 | kabupaten/kota |
5802 | 113 | capres/cawapres |
6282 | 101 | HIV/AIDS |
6379 | 99 | Kabupaten/Kota |
6615 | 94 | DPR/MPR |
7312 | 81 | Kido/Hendra |
7729 | 75 | TNI/Polri |
7767 | 74 | 30/9 |
8633 | 63 | 17/7 |
9237 | 57 | A/H1N1 |

In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words; others are not. If words containing forbidden characters do not have a very low frequency, there may be a problem in preprocessing.
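
The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `suspicious_words`, the `allowed` set, and the `min_freq` threshold are hypothetical and would have to be chosen per language; the sample entries are taken from the tables in this section.

```python
# Special characters examined in this subsection.
SPECIAL_CHARS = set(",()%&$\"'+*=/_")


def suspicious_words(entries, allowed=frozenset("'/&"), min_freq=5):
    """Return (word, freq, offending_chars) for words that contain a
    special character outside `allowed` and occur at least `min_freq` times.

    `allowed` and `min_freq` are assumptions for illustration only.
    """
    hits = []
    for word, freq in entries:
        bad = sorted((set(word) & SPECIAL_CHARS) - set(allowed))
        if bad and freq >= min_freq:
            hits.append((word, freq, "".join(bad)))
    return hits


# (word, frequency) pairs copied from the tables in this section.
sample = [
    ("1,5", 290),
    ("Eto'o", 35),
    ("HIV/AIDS", 101),
    ("90%diantaranya", 1),
    ('ini,"kata', 3),
    ("SBY)-Boediono", 15),
]

for word, freq, bad in suspicious_words(sample):
    print(word, freq, bad)
```

With these assumed settings, only `1,5` and `SBY)-Boediono` are reported: the apostrophe and slash are treated as allowed, and the remaining entries fall below the frequency threshold.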

Example query, here for words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
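
The same lookup can be tried for any of the characters listed above. The following is a minimal sketch using Python's built-in `sqlite3` module rather than the database behind the original query; it assumes a `words` table with the columns `w_id`, `word` and `freq` as in the query. The helper `words_containing` is hypothetical, `instr()` replaces `LIKE` so that `%` and `_` need no escaping, and the `ORDER BY` clause is an addition that makes the output order explicit.

```python
import sqlite3


def words_containing(conn, char, limit=10):
    """Return (rank, freq, word) rows for words that contain `char`.

    As in the query above, ids up to 100 are skipped and 100 is
    subtracted from w_id to obtain the rank.
    """
    return conn.execute(
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND instr(word, ?) > 0 "
        "ORDER BY w_id LIMIT ?",
        (char, limit),
    ).fetchall()


# Tiny in-memory stand-in for the real table (rows taken from this section).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [
        (50, "%", 900),          # w_id <= 100, ignored by the query
        (11559, "ASEAN+3", 41),
        (82734, "10%-nya", 1),
        (88875, "50%+1", 1),
    ],
)

for row in words_containing(conn, "%"):
    print(row)
```

Calling `words_containing(conn, "+")` or `words_containing(conn, "_")` works the same way, which is the main reason for preferring `instr()` over a `LIKE` pattern in this sketch.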